DETAILED ACTION
Response to Amendment 

1. The objection to claims 47 and 50 is withdrawn in view of the amendment filed
12/05/05.

2. The rejection of claims 1-39, 47 and 50 is withdrawn in view of the amendment
filed 12/05/05.

3. Applicant's arguments filed 12/05/05 have been fully considered but they are not 
persuasive. Applicant states that Sun does not teach, "wherein the application- 
independent interface and the engine-independent interface are different interfaces". 
The Examiner respectfully disagrees. Sun teaches a speech API to control the 
interfacing between an application and a speech-related engine. As outlined in section 
4.1 of Sun, the system identifies the application's functional requirements for an engine, 
locates and creates a specified engine, allocates resources to this engine, sets up the 
engine, performs operations with the engine and finally deallocates the resources of the 
engine. As shown, the speech API communicates with both the application and the 
engine. Therefore, because the application and the engine have different 
communication functionalities, the speech API must comprise both an application 
interface component and an engine interface component in order to control these 
interactions. Because these interface components would be specific to either the
application or the engine, they would inherently be different. Therefore, the new
rejection is given below, necessitated by amendment.
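
For illustration only, the reasoning above can be sketched in code. The following is a minimal Python sketch, not code from Sun and not the Java Speech API itself; every class and method name is hypothetical. It assumes a middleware object that exposes one surface to the application and calls a separate surface on the engine, following the allocate, operate and deallocate sequence outlined in section 4.1 of Sun.

```python
from typing import Optional, Protocol


class EngineInterface(Protocol):
    """Hypothetical engine-facing surface: what the middleware calls on an engine."""

    def synthesize(self, text: str) -> str: ...


class UpperCaseEngine:
    """Stand-in engine; any object with synthesize() satisfies EngineInterface."""

    def synthesize(self, text: str) -> str:
        return text.upper()


class SpeechMiddleware:
    """Hypothetical middleware: speak() faces the application,
    while synthesize() is the separate call made toward the engine."""

    def __init__(self, engine: EngineInterface) -> None:
        # Locate and set up the engine chosen for the application.
        self._engine: Optional[EngineInterface] = engine

    def speak(self, text: str) -> str:
        # Application-facing method; delegates to the engine-facing call.
        if self._engine is None:
            raise RuntimeError("engine has been deallocated")
        return self._engine.synthesize(text)

    def close(self) -> None:
        # Deallocate the engine's resources.
        self._engine = None
```

In this sketch the application only ever sees `speak()` and the engine only ever sees `synthesize()`, which is the sense in which the two interface components are specific to one side and therefore differ.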

4. In response to the traversal of the Official Notice given in claims 4, 8 and 53, the
Examiner respectfully submits the requested references. 

As per claims 4 and 53, Tewfik et al. (U.S. Pat. Pub. 2004/0174996A1) teaches
an audio output device that can output audio in various formats based upon the format 
of the audio file hence the data format of the audio output device is changed when a 
differently formatted file is obtained (paragraph 23). 

As per claim 8, Sugiyama et al. (U.S. Pat. 6,345,245) teaches a system for 
managing a common dictionary as well as user dictionaries where the user dictionaries 
are updated by loading entries from the common dictionary (col. 2, lines 34-54). 

Claim Rejections - 35 USC § 102 

5. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

6. Claims 1-3, 6, 7, 10-39, 51 and 52 are rejected under 35 U.S.C. 102(b) as being 
anticipated by Sun Microsystems ("Java Speech API Programmer's Guide"), herein 
referred to as Sun. 

As per claim 1, Sun teaches a middleware layer configured to facilitate 
communication between a speech-related application and a speech-related engine, 
comprising: 
a speech component having an application-independent interface (Java Speech API
can be used in a wide range of applications such as a voice response system, a 
dictation system and speech technology for the handicapped, hence application- 
independent, section 1.1) configured to be coupled to the application and an engine- 
independent interface (includes a standard API for speech so users can choose the 
speech products to use that best meet their needs, wherein these speech products are 
speech engines such as speech synthesizers and speech recognizers, hence engine- 
independent, sections 1.1 and 4.1) configured to be coupled to the engine and at least 
one processing component configured to perform speech related services for the 
application and the engine (provides support and interface between the applications and 
the speech engines, section 1.2), wherein the application-independent interface and the 
engine-independent interface are different interfaces (speech API communicates with 
both the application and the engine, therefore it must inherently have both an 
application interface component and an engine interface component, section 4.1). 

7. As per claim 2, Sun teaches the speech component includes a plurality of 
processing components associated with a plurality of different processes, wherein the 
speech component further comprises: a marshaling component, configured to access 
at least one processing component in each process and to marshal information transfer 
among the processes (Speech API has multiple processes that would inherently have a 
component to transfer the data between the processes, section 4.1). 

8. As per claims 3 and 6, Sun teaches a format negotiation component configured 
to negotiate a data format of data used by the audio device and data used by the engine 



Application/Control Number: 09/751,836

Art Unit: 2655 

wherein the format negotiation component is configured to invoke a format converter to 
convert the data format of data between the engine and the audio device to a desired 
format based on the data format used by the audio device and the data format used by 
the engine (must inherently convert the queued digital signal created by the synthesizer 
into an analog signal to be output through the speaker, section 5.4). 

9. As per claim 7, Sun teaches a lexicon container object configured to contain a 
plurality of lexicons and to provide a lexicon interface to the engine to represent the 
plurality of lexicons as a single lexicon to the engine and load the one or more user 
lexicons as one or more application lexicons (vocabulary manager adds the list of words 
used by the recognizer, section 4.6.3). 
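
For illustration only, the claimed lexicon container can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the vocabulary manager of Sun; the class name, the method names and the dictionary-based lexicon format are all hypothetical. It assumes that earlier-added lexicons take priority on lookup.

```python
from typing import Dict, List, Optional


class LexiconContainer:
    """Hypothetical container that holds several lexicons but answers
    lookups as if it were a single lexicon."""

    def __init__(self) -> None:
        self._lexicons: List[Dict[str, str]] = []

    def add_lexicon(self, lexicon: Dict[str, str]) -> None:
        # Lexicons added earlier are consulted first (an assumption of this sketch).
        self._lexicons.append(lexicon)

    def get_pronunciation(self, word: str) -> Optional[str]:
        # The engine calls this one method and never sees the individual lexicons.
        for lexicon in self._lexicons:
            if word in lexicon:
                return lexicon[word]
        return None
```

An engine using such a container would make a single lookup call regardless of how many user or application lexicons had been loaded.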

10. As per claims 10, 36 and 37, Sun teaches a site object having an interface 
configured to receive result information, indicative of recognized speech from the engine 
(results from synthesizer are placed in a queue and result interface obtains recognition 
results, sections 5.4 and 6.7.3.1). 

11. As per claim 11, Sun teaches the engine comprises a TTS engine (speech
synthesis, section 5) and wherein the processing component comprises a first object 
having an application interface and an engine interface (text to be synthesized is 
inputted through the application hence these interfaces exist, section 5.3). 

12. As per claim 12, Sun teaches the application interface exposes a method 
configured to receive engine attributes from the application and instantiate a specific 
engine based on the engine attributes received (application creates and allocates 
information to the synthesizer, section 5.1). 

13. As per claim 13, Sun teaches wherein the application interface exposes a
method configured to receive audio device attributes from the application and instantiate 
a specific audio device based on the audio device attributes received (receives 
indication from the application regarding the volume, speech rate and prosody of the 
speech to be outputted by the audio device, section 5.6). 

14. As per claim 14, Sun teaches a parser to receive input data to be synthesized 
and parse the input data into text fragments (formats the inputted text for synthesis, 
sections 2.1 and 5.3). 

15. As per claim 15, Sun teaches the engine interface is configured to call a method 
exposed by the engine to begin synthesis (property change events inherent in an 
engine call a method to change properties, section 5.6). 

16. As per claim 16, Sun teaches the engine comprises a speech recognition engine 
(section 6) and wherein the processing component comprises a first object having an 
application interface and an engine interface (application provides grammars to the 
engine so this interface must exist, section 6.1). 

17. As per claim 17, Sun teaches wherein the application interface exposes a 
method configured to receive recognition attributes from the application and instantiates 
a specific speech recognition engine based on the engine attributes received 
(application creates a grammar and defines the grammar, section 6.1). 

18. As per claim 18, Sun teaches wherein the application interface exposes a 
method configured to receive audio device attributes from the application and instantiate 
a specific audio device based on the audio device attributes received (receives 
indication from the application regarding the volume, speech rate and prosody of the
speech to be outputted by the audio device, section 5.6). 

19. As per claim 19, Sun teaches wherein the application interface exposes a 
method configured to receive an alternate request from the application and to configure 
the speech component to retain alternates provided by the SR engine for transmission 
to the application based on the alternate request (recognizer creates alternate 
recognition results and application specifies the number of alternates to return, sections 
6.7.9.2 and 6.7.10). 

20. As per claim 20, Sun teaches the application interface exposes a method 
configured to receive an audio information request from the application and to configure 
the speech component to retain audio information recognized by the SR engine based 
on the audio information request (recognizer performs language model adaptation 
based upon the input from the application, section 6.6). 

21. As per claim 21, Sun teaches wherein the application interface exposes a
method configured to receive bookmark information from the application identifying a 
position in an input data stream being recognized and to notify the application when the 
SR engine reaches the identified position (attaches a ResultListener to receive event 
progress information during recognition and uses a marker-reaches event in audio 
output hence suggesting its use in recognition, sections 5.5 and 6.1). 

22. As per claim 22, Sun teaches the engine interface is configured to call the SR 
engine to set acoustic profile information in the SR engine (recognizer receives a profile 
that sets the acoustic models that are used, section 6.9). 

23. As per claim 23, Sun teaches the engine interface is configured to call the SR
engine to load a grammar in the SR engine (grammar interface creates and enables 
grammars, section 6.4.1). 

24. As per claim 24, Sun teaches the engine interface is configured to call the SR 
engine to load a language model in the SR engine (profile loaded into the recognizer 
contains the language model, section 6.9). 

25. As per claims 25 and 27, Sun teaches wherein the application interface exposes 
a method configured to receive a grammar request from the application and to 
instantiate a grammar object based on the grammar request to be used by the SR 
engine (application loads and enables the grammars in the engine, section 6.1). 

26. As per claim 26, Sun teaches the grammar object includes a word sequence data 
buffer (grammar format includes a rule defining the sequence of allowable words, 
section 2.2.1). 

27. As per claim 28, Sun teaches the grammar includes words, rules and transitions 
and wherein the grammar object includes an application interface and an engine 
interface (grammar includes rules, words and transitions that are controlled by the
grammar interface, sections 6.4.1 and 6.5.3).

28. As per claim 29, Sun teaches wherein the application interface exposes a 
grammar configuration method configured to receive grammar configuration information 
from the application and configure the grammar based on the grammar configuration 
information (Speech API supports dynamic grammars that can be changed at any time, 
section 6.4.2). 

29. As per claim 30, Sun teaches grammar configuration method is configured to
receive word change data, rule change (activation/deactivation) data, and transition 
change data and change words, rules and transitions in the grammar in the grammar 
object based on the grammar received (setEnable method enables and disables rules, 
section 6.5.1). 

30. As per claim 31, Sun teaches wherein the grammar configuration method is
configured to receive grammar activation information and enable or disable grammars in 
the grammar object based on the grammar activation information (grammar interface 
enables and disables grammars, section 6.4.1). 

31. As per claim 32, Sun teaches to change words, rules and transitions in the
grammar in the grammar object based on the grammar received (commands to change 
rules in the grammar and allows for dynamic grammars, sections 6.5.1 and 6.5.4). 

32. As per claims 33-35, Sun teaches the engine interface is configured to call the 
SR engine to load the grammar in the SR engine, wherein the call updates a 
configuration of the grammar or activation state in the SR engine (grammar interface 
enables, updates and activates grammars, sections 6.4.1 and 6.4.2). 

33. As per claim 38, Sun teaches wherein the engine interface on the site object is 
configured to receive update information from the SR engine indicative of a current 
position of the SR engine in an audio input stream to be recognized (attaches a 
ResultListener to indicate the progress of recognition, section 6.1). 

34. As per claim 39, Sun teaches a result object configured to obtain the result 
information from the site object and expose an interface configured to pass the result 
information to the application (result interfaces receive results and provide them to the
application, section 6.7.3). 

35. As per claim 51, Sun teaches a method of updating a grammar configuration of a
grammar used by a speech recognition engine based on update information from an 
application, comprising: 

calling a first object in an application-independent, engine-independent, 
middleware layer, between the SR engine and the application (Java Speech API can be 
used in a wide range of applications, hence application-independent and includes a 
standard API for speech so users can choose the speech products to use that best 
meet their needs hence engine-independent, section 1.1), with a pause request 
(suspended state, section 6.3.3); 

delaying return from the first object on a subsequent call from the SR engine 
(delays the return to the listening state until grammar updating is complete, section 
6.3.3); 

receiving the update information from the application at the middleware layer 
(grammars are updated in response to the user input which would come from the 
application, section 6.3.3); 

passing the update information from the middleware layer to the SR engine 
(grammars in the recognizer are updated, section 6.3.3); and 

returning on the subsequent call from the SR engine (returns to the listening 
state, section 6.3.3). 
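
For illustration only, the sequence of steps recited in claim 51 can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the recognizer state machine of Sun; the names `ToyEngine`, `MiddlewareSite`, `pause`, `update_grammar`, `resume` and `synchronize` are hypothetical, and a `threading.Event` stands in for whatever mechanism delays the return to the engine.

```python
import threading
from typing import Iterable, Optional, Set


class ToyEngine:
    """Stand-in SR engine that only stores its active vocabulary."""

    def __init__(self) -> None:
        self.grammar: Set[str] = {"yes", "no"}

    def load_grammar(self, words: Set[str]) -> None:
        self.grammar = set(words)


class MiddlewareSite:
    """Hypothetical first object in the middleware layer."""

    def __init__(self, engine: ToyEngine) -> None:
        self._engine = engine
        self._resume = threading.Event()
        self._resume.set()                      # not paused initially
        self._pending: Optional[Set[str]] = None

    def pause(self) -> None:
        # Application calls the first object with a pause request.
        self._resume.clear()

    def update_grammar(self, words: Iterable[str]) -> None:
        # Middleware receives the update information from the application.
        self._pending = set(words)

    def resume(self) -> None:
        self._resume.set()

    def synchronize(self) -> None:
        # Subsequent call from the engine: the return is delayed while paused.
        self._resume.wait()
        if self._pending is not None:
            # Pass the update information on to the engine before returning.
            self._engine.load_grammar(self._pending)
            self._pending = None
```

In this sketch the engine's call to `synchronize()` does not return until the application has supplied its update and called `resume()`, so the engine continues only with the updated grammar in place.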

36. As per claim 52, Sun teaches to change words, rules and transitions in the
grammar in the grammar object based on the grammar received (commands to change 
rules in the grammar and allows for dynamic grammars, sections 6.5.1 and 6.5.4). 



Claim Rejections - 35 USC § 103 

37. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

38. Claims 4, 53 and 54 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Sun in view of Tewfik et al. (U.S. Pat. Pub. 2004/0174996A1).

As per claim 4, Sun does not specifically teach the format negotiation component 
is configured to reconfigure the audio device to change the data format of the data used 
by the audio device. 

Tewfik teaches an audio output device that can output audio in various formats 
based upon the format of the audio file hence the data format of the audio output device 
is changed when a differently formatted file is obtained (paragraph 23). 

It would have been obvious to one of ordinary skill in the art at the time of
invention to modify the system of Sun to reconfigure the audio device to change the 
data format of the data used by the audio device as taught by Tewfik because it would 
enable the audio device to stay consistent with the speech engine. 

39. As per claim 53, Sun teaches a method of formatting data for use by a speech
engine and an audio device, comprising: 

obtaining, at a middleware layer which facilitates communication between the 
speech engine and an application, a data format for data used by the engine; obtaining, 
at the middleware layer, a data format of data used by the audio device; determining, at 
the middleware layer, whether the engine data format and the audio data format are 
consistent; and, if not, utilizing the middleware layer to attempt to change the data format
of the data (middleware would inherently know the data formats used by the audio 
device and engine to output the speech and would inherently convert the queued digital 
signal created by the synthesizer into an analog signal to be output through the 
speaker, section 5.4). 

Sun does not specifically teach the format negotiation component is configured to 
reconfigure the audio device to change the data format of the data used by the audio 
device. 

Tewfik teaches an audio output device that can output audio in various formats 
based upon the format of the audio file hence the data format of the audio output device 
is changed when a differently formatted file is obtained (paragraph 23). 

It would have been obvious to one of ordinary skill in the art at the time of
invention to modify the system of Sun to reconfigure the audio device to change the 
data format of the data used by the audio device as taught by Tewfik because it would 
enable the audio device to stay consistent with the speech engine. 

40. As per claim 54, Sun teaches invoking a format converter to convert the data
format of data between the engine and the audio device to a desired format based on 
the data format used by the audio device and the data format used by the engine (must 
inherently convert the queued digital signal created by the synthesizer into an analog 
signal to be output through the speaker, section 5.4). 
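
For illustration only, the format negotiation recited in claims 4, 53 and 54 can be sketched as follows. This is a minimal Python sketch under stated assumptions and does not reproduce any implementation from Sun or Tewfik; `AudioFormat`, `ToyAudioDevice`, `try_set_format` and `negotiate` are hypothetical names, and a format is reduced to a sample rate and a bit depth.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class AudioFormat:
    sample_rate_hz: int
    bits_per_sample: int


class ToyAudioDevice:
    """Stand-in audio device with a current format and a set of supported formats."""

    def __init__(self, current: AudioFormat, supported: Iterable[AudioFormat]) -> None:
        self.format = current
        self._supported = set(supported)

    def try_set_format(self, fmt: AudioFormat) -> bool:
        # Reconfigure the device only if it supports the requested format.
        if fmt in self._supported:
            self.format = fmt
            return True
        return False


def negotiate(engine_format: AudioFormat, device: ToyAudioDevice) -> str:
    """Compare the two formats and report which step resolves any mismatch."""
    if engine_format == device.format:
        return "consistent"
    if device.try_set_format(engine_format):
        return "device reconfigured"
    # Neither side matches, so a format converter would have to be invoked.
    return "converter required"
```

The three return values correspond to the formats being consistent, the audio device being reconfigured, and a format converter being needed between the engine and the audio device.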

41. Claims 8 and 9 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Sun in view of Sugiyama et al. (U.S. Pat. 6,345,245). 

As per claim 8, Sun does not specifically teach loading one or more user lexicons 
and one or more application lexicons from a lexicon data store. 

Sugiyama teaches a system for managing a common dictionary as well as user 
dictionaries where the user dictionaries are updated by loading entries from the 
common dictionary (col. 2, lines 34-54). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the system of Sun to load both user lexicons and application 
lexicons into the engine as taught by Sugiyama because using concentrated
vocabularies gives better recognition results.

42. As per claim 9, Sun teaches the lexicon interface is configured to be invoked by 
the engine to add a lexicon provided by the engine (Vocabmanager provided by the 
engine, section 4.6.3). 

Allowable Subject Matter

43. Claims 47 and 50 are allowed. 

44. Claim 5 is objected to as being dependent upon a rejected base claim, but would 
be allowable if rewritten in independent form including all of the limitations of the base 
claim and any intervening claims. 

Conclusion 

45. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. Boyle, III et al. (U.S. Pat. 5,717,747) teaches a system with a 
middleware layer that comprises both a distinct interface for the application and a 
distinct interface for the engine. 

46. Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37
CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later
than SIX MONTHS from the date of this final action.

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Matthew J. Sked, whose telephone number is
(571) 272-7627. The examiner can normally be reached on Mon-Fri (8:00 am - 4:30 pm).

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth, can be reached on (571) 272-7843. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



MS 

02/07/06 




